We, the undersigned, hereby declare and state as follows:

1. We are the named inventors on the above-referenced U.S. patent
application.

2. We conceived the invention that is the subject matter of one or more
claims of the above-referenced application at least as early as November 14, 2002. On or
about November 14, 2002, we prepared a Disclosure of Invention entitled "Updating
XML Views of Relational Data." A copy of the Disclosure of Invention is attached hereto
as Exhibit 1.

3. A related paper entitled "Updating XML Views of Relational Data"
(Paper Number 332) was attached to the Disclosure of Invention form. The paper was
also submitted for publication in ACM SIGMOD 2003. A copy of the paper is attached
hereto as Exhibit 2.

4. The invention was reduced to practice by implementing it in software
code prior to or in conjunction with the preparation of the paper. The software code
embodying the invention was used to obtain the initial experimental results referred to in
Section 4.4 (entitled Implementation) of the cited paper. The cited test results indicating
correct operation are evidence of an actual reduction to practice of one embodiment of
the present invention.

5. The implemented algorithm contained an Information Collection
Module and a View-Update Execution Module. See Par. 4.4.

Information-Collection Module

6. The Information Collection Module collects the static information
described in Section 4.2 of the paper. The view-relationship graph is then translated into
update plans that are persisted in the system and later used at run time. Id.

7. Claim 1 generally requires assigning at least one of a plurality of
categories to each of the nodes. The plurality of categories are based on a cardinality
relationship indicated by one or more correlation predicates and one or more foreign key
constraints in the relational database.

8. Among other things, the implementation collects the category of the
node. See Section 4.2. This is obtained from the view query that defines the XML view
and the algorithms defined in Section 3. Algorithm 1 in Section 3 describes how the node
is categorized.

9. As stated in Section 3.2, the implemented node categorization is based
upon cardinality relationships indicated by foreign-key constraints in the underlying
relational database.

10. The implementation of the Information-Collection Module was a
reduction to practice of a software program that assigned at least one of a plurality of
categories to each of the nodes, where the plurality of categories are based on a
cardinality relationship indicated by one or more correlation predicates and one or more
foreign key constraints in the relational database.

View-Update Execution Module 

11. The View-Update Execution Module provides the interface for
deletion, insertion, movement, and replacement (deleting the old node and inserting the
new node in one transaction) on a given XML node at run time. The execution module
interacts with the relational database and the DOM interface to access the underlying data
for the XML view. See Par. 4.4.

12. Claim 1 generally requires determining whether the update to the
XML document can be updated in the underlying relational database based on the
assigned category.

13. The implementation of Algorithm 2, discussed in Section 3.3, provides
an example of how to translate the deletion of a node from an XML view document into
base view updates, and the categories for which such deletions cannot be performed.
Generally, Algorithm 2 shows how a deletion is performed for each category.

14. The implementation of the View-Update Execution Module was a
reduction to practice of a software program that determined whether an update to the
XML document can be updated in the underlying relational database based on an
assigned category.

15. Claims 12 and 16 have similar limitations to claim 1.

16. The work noted in this Affidavit took place at the Murray Hill facility
of Lucent Technologies in Murray Hill, New Jersey.

17. All statements made herein of our own knowledge are true, and all statements made
on information and belief are believed to be true.

18. I understand that willful false statements and the like are punishable by fine or
imprisonment, or both, under 18 U.S.C. §1001, and may jeopardize the validity of the
application or any patent issuing thereon.

Date: I2/I*JZ*** 

Philip Bohannon 

Date: 

Xin Dong 

Date: _ 

Henry F. Korth

Date: 

Suryanarayan Perinkulam

Date: 

Philip Bohannon 

Date: 

Xin Dong 

Date: S?-/g-3tcpg 

Henry F. Korth

Date: 

Suryanarayan Perinkulam
3. GOVERNMENT CONTRACT INVENTION

Was the invention made under a government contract? [ ] Yes [x] No

4. PRESENT STATE OF THE ART

Briefly describe the closest already-known technology that relates to the invention. This would
include, for example, already existing products, methods or compositions which are known to you
personally or through descriptions in publications or patents.

There are three practical approaches addressing the view-update problem in a general setting
(typically relational, and not addressing any XML-specific issues). One is to regard the underlying
database and the view as abstract data types, with the updating operations predefined explicitly by the
DBA [12, 13]. The second determines a unique or a small set of update translations based on the
syntax and semantics of a view definition [7, 10]. The algorithm presented in [10], given that the
underlying base tables are in Boyce-Codd-Normal-Form, generates a query graph for the
select-project-join view and, based on the graph, gives a list of templates for possible translations of
deletion, insertion and replacement operations on the view into certain update operations on the
underlying database. This work is further extended by [4] for object-based views. The third approach
performs run-time translation. [14] transforms the view-update problem into the
constraint-satisfaction problem (CSP), with exponential time complexity in the number of constraint
variables. [6] gives run-time translations of view tuple deletions using data lineage, but claims the
current state of the art in data lineage is not applicable to other view update operations like insertions.
Recent work on XML updates [17] studies this problem in the context of XML shredding using the
inlining method [16]. Inlining defines a specific procedure for the conversion of a given XML schema
into a relational schema and the storage of XML data in a relational database conforming to that
schema. The original XML schema alone determines the update strategy. Thus, the prior work on
updates specifically applying to XML-defined views considers only a case where the underlying
database fits a particular type of database schema. The work here considers any well-defined schema.

5. ADVANCEMENT IN STATE OF THE ART

Briefly describe the unique advancement achieved by the invention. This may be done, for example, by
describing a problem with the prior art that is solved or specific objects that are achieved by the invention.

There are three characteristics of our invention that are unique:

1) The relational database schema and constraints on which the XML views are defined are predefined
and arbitrary. That is, our algorithm works based on any arbitrary given schema that is predefined by
a pre-existing database design.

2) The XML view is defined based on the database, whereas prior work on XML views assumes the
database is defined based on a given set of XML data.

3) The XML view and updates through that view are always synchronized with the base (relational)
data, whereas current systems treat XML views as a non-updateable cache.

6. HOW ACHIEVED

Briefly describe the invention and how it achieves the advancement described in paragraph 6.

In this paper, we develop a framework for XML updates that is sufficiently general and flexible to deal
with any XML view over virtually any relational schema. Based on this, we present algorithms to
translate an XML update into update operations on the underlying relational database. These algorithms
are based on a "view-relationship graph" that encapsulates relationships and constraints on view data
based upon both the view definition and the underlying relational database schema. We then partition
the graph based on the graph structure such that each partition has specific well-defined properties in
terms of how updates are to be performed (or disallowed). In addition, we describe an implementation
on top of an existing database system, the ROLEX system, which is built on top of the DataBlitz(TM) main
memory database system. (These two systems are not named in the paper as part of our effort to
anonymize our submission.) Our framework ensures the consistency of the updated database, and takes
into consideration constraints both in the underlying relational database and in the XML-view definition.
ABSTRACT

Recent XML middleware systems bridge between XML
applications and relational database systems by supporting XML
publishing and querying. Update operations, however, are not
well supported, particularly when the underlying relational
database not only serves XML applications, but also is accessed
directly by relational applications. In this paper, we develop a
framework for XML updates that is sufficiently general and
flexible to deal with any XML view over virtually any relational
schema. Based on this, we present algorithms to translate an
XML update into update operations on the underlying relational
database. In addition, we describe our implementation on top of
an existing database system. Our framework ensures the
consistency of the updated database, and takes into
consideration constraints both in the underlying relational
database and in the XML-view definition. The experimental
results show that the system operates correctly with reasonable
performance.

1. INTRODUCTION

In recent years, XML has gained widespread popularity and has
begun to play an important role for information representation
and exchange. Although XML-based applications may be
developed from scratch, in most cases they interoperate with
existing SQL-centric applications. Recent work addresses the
techniques required to define the mapping between XML data
and relations, to format SQL output as XML, and to translate
queries posed in XML query languages into SQL [1, 9, 15]. The
next step in making them into full-featured interoperation tools
is to explore the management of update operations.

We consider the case of an XML-based application that is
built upon an existing relational database that serves traditional
RDBMS applications as well. There are three characteristics of
this architecture: (1) the relational database schema and
constraints are predefined and arbitrary; (2) the XML view is
defined based on the database; and (3) the XML view and
updates through that view are always synchronized with the base
(relational) data. This contrasts with prior work (which we
discuss in more detail below) in which XML views are
read-only or in which the underlying relational schema is restricted to
be one derivable via "XML shredding." Our work on updates
through XML views is closely related to the vast body of earlier
work on updates to relational databases through relational views.
Many of the problems addressed in that work apply to our
domain as well, and this paper focuses on those aspects unique
to our assumption of XML-defined views.

We base our work on the redacted¹ system, which
provides a declarative language to extract data from an existing
relational database and generate an XML version of data. This
system provides all of the basic features of standard commercial
relational products. Regardless of the specific underlying
relational system, our approach of supporting updates through
XML views allows such issues as concurrency, recovery, and
many aspects of consistency and integrity checking to be done
by the underlying database system. This contrasts with
XML-publishing environments such as [15] or commercial relational
systems, in which each application typically caches its own
materialized XML view and much of the capability of the
underlying database system goes unused.

1.1 Previous Work Based on Shredding 

The recent work on XML updates [17] studies this problem in
the context of XML shredding using the inlining method [16].
Inlining defines a specific procedure for the conversion of a
given XML schema into a relational schema and the storage of
XML data in a relational database conforming to that schema.
The original XML schema alone determines the update strategy.

In our framework, we regard the relational schema as
predefined and essentially arbitrary. Both the relational schema
and the XML schema form the input of the update problem in
our work. Consequently, many more possible cases need to be
considered. The solution must be general and flexible enough to
deal with any XML view over any relational schema. In
particular, our approach is suitable for existing relational
databases that are now to be accessed by XML-based
applications as well.

Two examples can demonstrate the complexity in handling
the side-effects of updates and contrast the assumptions of [17]
with those of our approach:

Example 1: Given the relational database and its schema as
shown in Figure 1(a), and the view definition² shown in Figure
1(b), the resulting XML document fragment is shown in Figure
1(c). Using the inlining algorithm as in [17], a different set of base
tables with a different relational schema would be generated
from the view, as demonstrated in Figure 1(d). In the latter
schema, the Metroarea relation, rather than the Hotel relation,
has a foreign key. Instead of having a single tuple for each metro
area, there is a separate Metroarea tuple for each hotel. When a
hotel node in the document is deleted, it is easy to see that all



¹ The name of the system has been redacted (removed) to
conform to SIGMOD's double-blind review policy. Details
will appear in the final version of the paper.

² The tag-query notation is derived from that of SilkRoute [9].



Metroarea (mID, mName)
Hotel (hID, hName, m_id)

mID   mName
101   Northwest

hID   hName        m_id
201   Courtyard    101
202   Doubletree   101

(a) Original schema and database

<hotel>
  ($h = SELECT hName
        FROM Hotel
  )
  <metro> ($m = SELECT mName
                FROM Metroarea
                WHERE mID = $h.m_id
  )</metro>
</hotel>

(b) View definition

Root
  hotel
    hName: "Courtyard"
    metro
      mName: "Northwest"
  hotel
    hName: "Doubletree"
    metro
      mName: "Northwest"

(c) XML document fragment

Hotel (hID, hName)
Metroarea (mID, mName, h_id)

mID   mName       h_id
101   Northwest   201
102   Northwest   202

hID   hName
201   Courtyard
202   Doubletree

(d) Inlining-generated schema and database

Figure 1: Example 1

the tuples related to its child nodes, more specifically, the related
Metroarea tuples, must be deleted. However, in the original
database of Figure 1(a), the Metroarea tuple must be preserved
for other hotels. As we (unlike [17]) permit either database
schema as our underlying schema, we need to find a way to
distinguish them and give different update plans.
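
The contrast drawn in Example 1 can be sketched with an in-memory SQLite database. This is only an illustrative sketch under stated assumptions, not the implementation described in this paper: the function names, the use of SQLite, and the hotel identifiers 201 and 202 in the inlined schema are all chosen for the example.

```python
import sqlite3

# Join conditions that rebuild the hotel/metro view of Figure 1(b)
# over each schema (names are assumptions for this sketch).
ORIGINAL_JOIN = "m.mID = h.m_id"
INLINED_JOIN = "m.h_id = h.hID"


def build_original():
    """Figure 1(a): Hotel carries the foreign key to Metroarea."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Metroarea (mID INTEGER PRIMARY KEY, mName TEXT);
        CREATE TABLE Hotel (hID INTEGER PRIMARY KEY, hName TEXT, m_id INTEGER);
        INSERT INTO Metroarea VALUES (101, 'Northwest');
        INSERT INTO Hotel VALUES (201, 'Courtyard', 101);
        INSERT INTO Hotel VALUES (202, 'Doubletree', 101);
    """)
    return conn


def build_inlined():
    """Figure 1(d): Metroarea carries the foreign key, one tuple per hotel."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Hotel (hID INTEGER PRIMARY KEY, hName TEXT);
        CREATE TABLE Metroarea (mID INTEGER PRIMARY KEY, mName TEXT, h_id INTEGER);
        INSERT INTO Hotel VALUES (201, 'Courtyard');
        INSERT INTO Hotel VALUES (202, 'Doubletree');
        INSERT INTO Metroarea VALUES (101, 'Northwest', 201);
        INSERT INTO Metroarea VALUES (102, 'Northwest', 202);
    """)
    return conn


def materialize(conn, join_condition):
    """Return the (hName, mName) pairs visible through the XML view."""
    sql = ("SELECT h.hName, m.mName FROM Hotel h JOIN Metroarea m ON "
           + join_condition + " ORDER BY h.hName")
    return conn.execute(sql).fetchall()


def delete_hotel_original(conn, hotel_id):
    """Original schema: the shared Metroarea tuple must be preserved."""
    conn.execute("DELETE FROM Hotel WHERE hID = ?", (hotel_id,))


def delete_hotel_inlined(conn, hotel_id):
    """Inlined schema: the hotel's own Metroarea tuple is deleted as well."""
    conn.execute("DELETE FROM Metroarea WHERE h_id = ?", (hotel_id,))
    conn.execute("DELETE FROM Hotel WHERE hID = ?", (hotel_id,))
```

Both update plans leave the same view (only the remaining hotel under its metro area), yet they touch different base tuples, which is why the two schemas need different plans.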

Example 2: Figure 2(a) shows a relational schema, and Figure
2(b) shows a view definition. The original relational schema is
made up of three relations, while the inlining-generated schema
based on the XML view, shown in Figure 2(c), consists of only
two relations. Given the latter schema, the deletion of a metro
node can be executed in a straightforward manner by
propagating the deletion to the Confroom table. If we decide to
do the same thing under the original relational schema, the
Hotel table is affected due to the foreign-key constraint. As the
Hotel table is invisible to the XML application, we consider any
operation on it undesirable.



Metroarea (mID, mName)
Hotel (hID, hName, m_id)
Confroom (cID, roomnum, h_id)

(a) Original schema

<metro>
  ($m = SELECT mName
        FROM Metroarea)
  <conference-room>
    ($c = SELECT cID, roomnum, m_id
          FROM Confroom, Hotel
          WHERE Confroom.h_id = Hotel.hID
          AND Hotel.m_id = $m.mID
    )</conference-room>
</metro>

(b) View definition

Metroarea (mID, mName)
Confroom (cID, roomnum, m_id)

(c) Inlining-generated schema

Figure 2: Example 2

The above examples show that updates through XML
views are more challenging to manage when the underlying
relational database schema is arbitrary and not the one derived
from the view via inlining. The challenge arises from the fact
that the XML view does not determine a unique relational
database schema, and so assumptions about the specific nature
of the database schema cannot be built into the view-update
algorithms.

1.2 Previous Work on View Updates

The view-update problem in relational databases is a
long-standing issue that has been studied extensively. A survey of
research on the view-update problem is presented in [8]. An
abstract formulation of this problem is given by the view
complementary theorem in [3]. [5] analyzes the complexity of
automatically finding a minimal complement view for updates
and shows the problem to be NP-complete. [2] proves that
deciding whether a side-effect free update solution exists is
generally NP-hard.

There are three practical approaches addressing the
view-update problem. One is to regard the underlying database and
the view as abstract data types, with the updating operations
predefined explicitly by the DBA [12, 13]. The second
determines a unique or a small set of update translations based
on the syntax and semantics of a view definition [7, 10]. The
algorithm presented in [10], given that the underlying base
tables are in Boyce-Codd-Normal-Form, generates a query
graph for the select-project-join view and, based on the graph,
gives a list of templates for possible translations of deletion,
insertion and replacement operations on the view into certain
update operations on the underlying database. This work is
further extended by [4] for object-based views. The third
approach performs run-time translation. [14] transforms the
view-update problem into the constraint-satisfaction problem
(CSP), with exponential time complexity in the number of
constraint variables. [6] gives run-time translations of view tuple
deletions using data lineage, but claims the current state of the
art in data lineage is not applicable to other view update
operations like insertions.



Our XML view-update algorithm follows the line of the
second approach. It shares the basic idea of deriving update
methods from the view definition. We adapt the object concepts
of [4] in our XML-based model. However, the XML model has
features that distinguish it from the object model. For example,
the XML document helps us decide the propagation direction. In
Example 1, no matter whether the table Hotel has the foreign
key pointing to Metroarea or the opposite, the propagation
should follow the direction from Hotel to Metroarea. On the
other hand, the possibility (indeed, the reasonability) of
repeating certain data in different parts of an XML hierarchy
raises more restrictive preconditions for an XML view to be
updatable. Moreover, as an element can correspond to either a
tuple or a field, the same type of update operation in XML needs
to be translated into different kinds of relational update
operations. Finally, some special XML view features such as
transitive relationships and ID/IDREF references (which we will
discuss in Sections 3.4 and 3.5) bring more complication into the
problem.

1.3 Contributions

In this paper, we describe the features of a working prototype of
an XML view-update manager above the redacted system. Our
system provides side-effect checking, DTD validation,
constraint checking, and finally update translation and
execution. The key contributions of this paper are summarized
as follows:

1. We develop a framework for the determination of element
updatability and the resulting impact on underlying tables.
Based on this, we design algorithms for update translation.

2. After breaking up update operations into various sub-tasks
such as DTD validation, constraint checking, and
translation, we assign each sub-task to the XML view side
or the relational database side as appropriate.

3. We present effective mechanisms to deal with constraints,
re-organizing constraints from the database schema as well
as the view schema, to improve performance.

4. Lastly, we implement the main part of the view-update
architecture in an existing system. Experiments show that it
works correctly and that the performance is reasonable.

The paper is organized as follows. Section 2 discusses the
XML view-update problem. Section 3 presents our view-update
algorithms. Section 4 presents the architecture of our
experimental system. We conclude and discuss future work in
Section 5.

2. PROBLEM DEFINITION 

In this section, we first list the set of allowed update operations
and then introduce our running example.

2.1 XML Update Syntax and Semantics

In this paper, we do not focus on the specific syntax for the
expression of XML updates. ([11] and others provide a syntax
within XQuery for updates.) Regardless of the specific syntax
used for XML updates, they can be divided into several
categories as discussed below.

The first distinction we draw is among nodes of an XML
document that are materialized from an XML view. A text node
in an XML document represents the string or numeral value of a
PCDATA element or an attribute. We refer to this as a value. A
non-text node (or node, when no confusion arises) is one that is
not a value. We examine update operations only on nodes and
not on values, as the latter can be transformed easily to the
replacement of a node. Among the nodes in an XML document,
the node representing a PCDATA element or an attribute is
called a leaf node. Other non-root nodes are called branch
nodes.

We consider XML update operations that touch the data but
not the tags. Tag modification would result in schema change,
which is beyond the scope of this paper. Data-update operations
include deletion, insertion, movement, and replacement. In an
XQuery-based syntax, these operations make use of the XQuery
FLWR (FOR, LET, WHERE) statements: iterator, assignment,
and conditional, to locate the nodes for updates. We ignore order
issues because the view-definition language used in the redacted
system does not offer a mechanism to define element order.
These operations may be categorized as follows:

• A deletion is the removal of the indicated node, as well as
any nodes or values contained within the selected node.
Stated in terms of an XML document, the delete operation
removes the entire subtree rooted at the selected node. A
node that is the obligatory child of its parent node
according to the DTD cannot be removed. A node referenced
by other nodes using IDREF cannot be removed.

• An insertion adds a node together with its descendants and
values under a parent node. In an XML document, the
operation inserts a subtree into a certain location. The entire
subtree is given in the insertion command. The insertion of
a node that does not conform to the DTD, or of a node that
references non-existing nodes by IDREF, is not allowed.

• A movement moves the node together with its descendants
and values from the old position to the new position, under
another node with the same type as its original parent. DTD
cardinality constraints must be observed. Note that a
movement does not equal deletion followed by insertion,
because movement preserves the identity of the node.

• A replacement can be regarded as deleting the old node and
inserting the new node in one transaction. DTD and IDREF
consistency need to be enforced with regard to the
replacement as a whole. The deletion and insertion steps of
a replacement may cause a temporary violation of
constraints.
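
The four operations above can be sketched against an in-memory XML tree. This is a minimal sketch using Python's standard `xml.etree.ElementTree`; the function names are assumptions for illustration, and the DTD and IDREF checks described above are deliberately omitted.

```python
import xml.etree.ElementTree as ET


def delete_node(parent, node):
    """Deletion: removing the node removes the entire subtree rooted at it."""
    parent.remove(node)


def insert_node(parent, subtree):
    """Insertion: the whole subtree is supplied and attached under the parent."""
    parent.append(subtree)


def move_node(old_parent, new_parent, node):
    """Movement: relocate the same node object, so its identity is preserved."""
    if old_parent.tag != new_parent.tag:
        raise ValueError("new parent must have the same type as the original parent")
    old_parent.remove(node)
    new_parent.append(node)


def replace_node(parent, old, new):
    """Replacement: delete the old node and insert the new one as a single unit."""
    index = list(parent).index(old)
    parent.remove(old)
    parent.insert(index, new)


# Usage: move a hotel from one metro node to another, then replace it.
root = ET.fromstring(
    "<root><metro><hotel><hName>Courtyard</hName></hotel></metro><metro/></root>"
)
first_metro, second_metro = list(root)
hotel = first_metro[0]
move_node(first_metro, second_metro, hotel)
replace_node(second_metro, hotel, ET.fromstring("<hotel><hName>Doubletree</hName></hotel>"))
```

In a real system the replacement would run inside one database transaction; here it is simply two in-memory steps performed back to back.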

Finally, in an XML view, data from a relational tuple could
appear in multiple parts of the XML document based on the
view definition. We base our discussion on the assumption that
when a user updates a node in the XML view, she means to
update that specific part and does not expect any changes to the
rest of the view. In other words, we aim to fulfill the update on
the indicated piece of data while keeping the rest of the view
intact. This is in contrast to the approach taken by [4], where
changes to the indicated object may be cascaded to other
instances in the object view. However, our algorithm framework
can be easily modified to allow cascades of updates to various
parts of the materialized document.



Metroarea (mID, mName)
State (sID, sName)
Hotel (hID, hName, starrating, pool, gym, street, city, state_id, metro_id)
Phone (phID, phoneNo)
Confroom (cID, croomnum, capacity, rackrate, c_h_id)
Guestroom (gID, roomnum, type, rackrate, g_h_id)
Availability (aID, startdate, enddate, price, a_g_id)
Restaurant (restID, rName, rCity)

(a) Relational database schema

<metro>
  ($m = SELECT mName FROM Metroarea)
  <hotel>
    ($h = SELECT hName, starrating, pool, gym
          FROM Hotel
          WHERE pool > 0 AND metro_id = $m.mID)
    <state>
      ($s = SELECT sName
            FROM State
            WHERE sID = $h.state_id
      )</state>
    <conference-room>
      ($c = SELECT croomnum, capacity
            FROM Confroom
            WHERE rackrate > 2 AND c_h_id = $h.hID)
      <phone-number>
        ($p = SELECT phoneNo
              FROM Phone
              WHERE phID = $h.hID
        )</phone-number>
    </conference-room>
    <guest-room>
      ($g = SELECT roomnum, type
            FROM Guestroom
            WHERE rackrate > 2 AND g_h_id = $h.hID)
      <availability>
        ($a = SELECT startdate, enddate, price
              FROM Availability
              WHERE a_g_id = $g.gID
        )</availability>
    </guest-room>
    <nearby-restaurant>
      ($r = SELECT rName, rCity
            FROM Restaurant
            WHERE rCity = $h.city
      )</nearby-restaurant>
  </hotel>
</metro>

(b) View definition
Figure 3: Example 3

2.2 Problem Definition and Assumptions

The problem we are trying to solve is defined as follows: given
an underlying relational database and its schema, and an XML
view definition over that database, how should the system
translate an update against the XML view into corresponding
updates against the underlying database without violating
consistency? By consistency, we mean that three criteria must be
satisfied. First, updates should be side-effect free [10]. That is,
the semantics of the update performed on a materialization of
the view must yield the same result as the regeneration of the
XML view after performing the translated update on the
underlying database. If a side-effect free translation does not
exist, the specific node is non-updatable. Second, the updated
XML view must be consistent with the view definition and the
DTD explicitly given in or derived from the view definition.
Third, the updated data in the underlying database must comply
with the relational schema and the constraints for the underlying
database.
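
The first criterion (side-effect freedom) can be sketched as a direct comparison: apply the update to the materialized view, apply the candidate translation to a copy of the base data, regenerate the view, and compare. The sketch below assumes the tiny database of Example 1 held in Python dictionaries; every name in it is hypothetical, and it does not cover the second and third criteria.

```python
import copy

# Base data of Figure 1(a), held in plain dictionaries for this sketch.
DB = {
    "Metroarea": [{"mID": 101, "mName": "Northwest"}],
    "Hotel": [
        {"hID": 201, "hName": "Courtyard", "m_id": 101},
        {"hID": 202, "hName": "Doubletree", "m_id": 101},
    ],
}


def materialize(db):
    """Regenerate the hotel/metro view as sorted (hName, mName) pairs."""
    metro = {m["mID"]: m["mName"] for m in db["Metroarea"]}
    return sorted(
        (h["hName"], metro[h["m_id"]]) for h in db["Hotel"] if h["m_id"] in metro
    )


def is_side_effect_free(db, view_update, translated_update):
    """True if updating the view equals regenerating it after the translation."""
    expected = sorted(view_update(materialize(db)))
    scratch = copy.deepcopy(db)  # never touch the caller's data
    translated_update(scratch)
    return materialize(scratch) == expected


def drop_courtyard_in_view(view):
    """The user's update: delete the Courtyard hotel node from the view."""
    return [entry for entry in view if entry[0] != "Courtyard"]


def delete_hotel_only(db):
    """Candidate translation 1: delete just the Hotel tuple."""
    db["Hotel"] = [h for h in db["Hotel"] if h["hID"] != 201]


def delete_hotel_and_metro(db):
    """Candidate translation 2: also delete the shared Metroarea tuple."""
    delete_hotel_only(db)
    db["Metroarea"] = [m for m in db["Metroarea"] if m["mID"] != 101]
```

Under these assumptions the first translation passes the check, while the second fails it because the remaining hotel loses its metro area as a side effect.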

We base our discussion on the redacted system's
view-definition language, although the algorithms we present can be
modified easily for other XML view-definition languages. We
divide predicates appearing in the WHERE clause into two
parts. The predicates involving binding variables are called
correlation predicates. The other predicates are called
non-correlation predicates. The correlation predicates indicate the
relationships among XML nodes. When we remove the
correlation predicates, each SQL query for a single node can be
regarded as a relational view that is isolated from any other
nodes. We call that an element base view.
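
The split between correlation and non-correlation predicates, and the resulting element base view, can be sketched as follows. This is a simplified sketch: it assumes that binding variables are written with a leading `$` and that the WHERE clause is a flat conjunction joined by AND; the function names are hypothetical.

```python
import re

# A binding variable such as $m or $h (assumed notation for this sketch).
BINDING_VAR = re.compile(r"\$\w+")


def split_predicates(where_clause):
    """Split a flat AND-conjunction into (correlation, non-correlation) lists."""
    conjuncts = [
        p.strip()
        for p in re.split(r"\s+AND\s+", where_clause, flags=re.IGNORECASE)
        if p.strip()
    ]
    correlation = [p for p in conjuncts if BINDING_VAR.search(p)]
    non_correlation = [p for p in conjuncts if not BINDING_VAR.search(p)]
    return correlation, non_correlation


def element_base_view(select_list, tables, where_clause):
    """Build the node's query with its correlation predicates removed."""
    _, non_correlation = split_predicates(where_clause)
    sql = f"SELECT {select_list} FROM {tables}"
    if non_correlation:
        sql += " WHERE " + " AND ".join(non_correlation)
    return sql


# Usage with the hotel node of Example 3.
hotel_view = element_base_view(
    "hName, starrating, pool, gym", "Hotel", "pool > 0 AND metro_id = $m.mID"
)
```

A production parser would need to handle OR, parentheses, and string literals containing `$`; the flat split is enough to illustrate the definition.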

We make the following assumptions for the rest of the
paper:

• There are no aggregates, order-by, or group operations.
These operations usually make views non-updatable, as
was established in prior work for relational views [10].

• The underlying relational database is in BCNF. [7, 10]
discuss in detail the necessity of this assumption for
preservation of data dependencies.

• An element does not have more than one child node with
exactly the same type and the same content.

2.3 XML View Update Example

Our running example for the remainder of this paper is drawn
from a conference-planning application.

Example 3: Figure 3(a) shows the underlying database schema.
Figure 3(b) defines the XML view. From the foreign-key
constraints of the underlying database (which we do not show in
the figure), we determine the relationship of view nodes
metro:hotel to be one-to-many (1:n), hotel:state to be
many-to-one (n:1), hotel:conference-room to be 1:n,
hotel:guest-room to be 1:n, guest-room:availability to be 1:n,
hotel:nearby-restaurant to be many-to-many (m:n), and
hotel:phone-number to be one-to-one (1:1). (phID acts both as
the key and the foreign key of Phone; as a foreign key, it
references hID in the Hotel relation.)
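
One plausible way to read cardinalities such as those of Example 3 off the foreign-key constraints is sketched below. The rule used here is an assumption made for illustration and is not the paper's own categorization algorithm; the column names are reconstructed from Figure 3(a), and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ForeignKey:
    table: str            # referencing table
    column: str           # referencing column
    ref_table: str        # referenced table
    is_key: bool = False  # True when the column is also the key of `table`


# Foreign keys assumed for the schema of Figure 3(a).
EXAMPLE3_FKS = [
    ForeignKey("Hotel", "metro_id", "Metroarea"),
    ForeignKey("Hotel", "state_id", "State"),
    ForeignKey("Confroom", "c_h_id", "Hotel"),
    ForeignKey("Guestroom", "g_h_id", "Hotel"),
    ForeignKey("Availability", "a_g_id", "Guestroom"),
    ForeignKey("Phone", "phID", "Hotel", is_key=True),
]


def cardinality(parent_table, child_table, foreign_keys):
    """Return the parent:child cardinality implied by the foreign keys."""
    for fk in foreign_keys:
        if fk.table == child_table and fk.ref_table == parent_table:
            # Child references parent; a key-valued foreign key makes it 1:1.
            return "1:1" if fk.is_key else "1:n"
        if fk.table == parent_table and fk.ref_table == child_table:
            # Parent references child, so many parents may share one child.
            return "n:1"
    # No foreign key links the two tables (e.g. a join on city).
    return "m:n"


# Usage: the parent/child node pairs named in Example 3.
RELATIONSHIPS = {
    "metro:hotel": cardinality("Metroarea", "Hotel", EXAMPLE3_FKS),
    "hotel:state": cardinality("Hotel", "State", EXAMPLE3_FKS),
    "hotel:nearby-restaurant": cardinality("Hotel", "Restaurant", EXAMPLE3_FKS),
    "hotel:phone-number": cardinality("Hotel", "Phone", EXAMPLE3_FKS),
}
```

With these assumed constraints the sketch reproduces the relationships stated in Example 3.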

3. UPDATE TRANSLATION ALGORITHMS
3.1 Overview

As any XML update is based on a subtree instead of a single
node, the particular update may affect many nodes in the subtree
besides the indicated node itself. As a result, during translation,
while considering the updates against the relational tuple(s)
corresponding to the node itself, we also need to take into
account the tuples related to the descendant (or child) nodes.
This process is called propagating the update from the
parent-node element base view to the child-node element base views.

Given an update on an XML node, we decide (1) whether
the node is updatable for that specific update type; (2) how to
propagate the update; and (3) what type of updates (insert,
delete, replace) should be performed on the element base
view(s). These decisions are made by examining the view-
relationship graph that describes the relationships between node
pairs in the XML view. Based on the decisions we propose an
update plan on the element base view(s). Then, we rely on a
relational view-update algorithm, such as the one described in
[10], to obtain the correct update plan on the underlying
relational database.

Figure 4: View-relationship graph of Example 3

In the remaining sections, we first present our algorithm to
generate the view-relationship graph and the update plan for a
basic case in which correlation predicates are between a parent
node and a child node. Then we extend the algorithm to tackle
correlation predicates between a node and any of its ancestors.
Finally, we discuss the changes needed to manage IDREF
attributes.

3.2 XML View-relationship Graph

To visualize the relationships between node pairs in the XML
view, we transform the XML view into an XML view-
relationship graph. In an XML document, element tags and
attribute names indicate the type of the node. A node M is called
the direct parent of node N if M is the parent of N in the XML
view; N is called the direct child of M. In the view-relationship
graph, we add annotated edges between each node and its direct
parent to indicate the cardinality relationship. We use <- to
annotate a 1:n relationship, -> to annotate an n:1 relationship,
to annotate a 1:1 relationship, and finally — to annotate a m:n 
relationship. The view-relationship graph for the running
example is shown in Figure 4. The root of the view-relationship
graph is the node corresponding to the root of the XML view
document.

The cardinality relationship of a node pair is decided by
correlation predicates in the view definition. If the correlation
predicate in the child node is of the form ForeignKey =
$bindingVar.Key, where $bindingVar represents the direct
parent node, the relationship between the direct parent and the
child is 1:n. If the correlation predicate is of the form Key =
$bindingVar.ForeignKey, the relationship between the parent
and the child is n:1. If the foreign key also acts as a key for the
element base view, the relationship is labeled 1:1. If there is no
correlation predicate between the parent and the child, or the
predicate is not equality, or the comparison is not between a
foreign key and its referenced key, then the relationship is
labeled m:n. If there are several correlation predicates between
the same pair of nodes, we follow the precedence of 1:1, n:1,
1:n, and m:n (highest to lowest) to assign a cardinality
relationship. In the above discussion, the terms key and foreign
key refer to those of the element base view. For instance, in
Example 2, the key of the node conference-room is 
Gonfroom cID, and the foreign key is Hotel mjd 
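
The labeling rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Predicate record, its field names, and the functions classify and cardinality are hypothetical and do not come from the implementation described in this paper. Labels are given in parent:child order, as in the paragraph above.

```python
from dataclasses import dataclass
from typing import Iterable

# Precedence stated in the text, highest to lowest.
PRECEDENCE = ("1:1", "n:1", "1:n", "m:n")


@dataclass(frozen=True)
class Predicate:
    """Hypothetical summary of one correlation predicate between a
    child node and its direct parent."""
    child_side: str                 # "key", "foreign_key", or "other"
    parent_side: str                # "key", "foreign_key", or "other"
    is_equality: bool = True
    fk_references_key: bool = True  # the foreign key references the compared key
    fk_is_also_key: bool = False    # the foreign key is also a key of its view


def classify(pred: Predicate) -> str:
    """Label a single predicate with a parent:child cardinality."""
    if not pred.is_equality or not pred.fk_references_key:
        return "m:n"
    if pred.child_side == "foreign_key" and pred.parent_side == "key":
        # ForeignKey = $bindingVar.Key
        return "1:1" if pred.fk_is_also_key else "1:n"
    if pred.child_side == "key" and pred.parent_side == "foreign_key":
        # Key = $bindingVar.ForeignKey
        return "1:1" if pred.fk_is_also_key else "n:1"
    return "m:n"


def cardinality(predicates: Iterable[Predicate]) -> str:
    """Combine several predicates using the stated precedence.
    With no correlation predicate at all the result is m:n."""
    labels = {classify(p) for p in predicates}
    for label in PRECEDENCE:
        if label in labels:
            return label
    return "m:n"
```

The set-then-precedence approach keeps the rule for several predicates on the same node pair separate from the rule for a single predicate, which mirrors how the text states them.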

According to the cardinality relationship between node
pairs, we partition the graph into categories. In [4], relations are
grouped into categories based on object definition, including
subset, ownership and reference relationships, which imply 1:1,
1:n and n:1 cardinality relationships respectively. The many-to-
many relationship is not considered. We base our categorization
entirely upon cardinality relationships indicated by foreign-key
constraints in the underlying relational database. This helps us
capture more semantic information. Because of the motivation
provided by [4], we have chosen a convention similar to their
work in naming our categories.

Definition 1 An overlap island (OI) is a maximal subtree of the
view-relationship graph with a root N that satisfies one of the
following:

1) N has a direct parent outside the overlap island, and the
relationship between N and its direct parent is m:n.

2) There are other nodes that get non-exclusive data from the
same relation as N, and N has a relationship other than 1:n
with its parent.

(Certain overlap islands will be identified in Section 3.4 as
falling into the special category of transitive archipelagos.)

The root of the subtree is an OI-root. A node in the XML
document that corresponds to a node in an overlap island is an
OI-node. If it corresponds to an OI-root, it is an OI-root-node.

□

Observation 1 Given an OI-root-node, its direct parent can
have more than one direct child node of its type. For any OI-
node N, other nodes in the XML document may obtain their
values from the same relation tuple(s) as N.

Definition 2 The dependency continent (DC) is a maximal
subtree of the view-relationship graph such that all of the
following hold:

1) The root of the subtree is the root of the view-relationship
graph.

2) The cardinality relationship between a branch node in the
subtree and its direct parent is 1:1 or n:1.

3) No node in the subtree is a node in an overlap island.

A node in the XML document that corresponds to a node in the
dependency continent is a DC-node. □

Observation 2 For a given view-relationship graph there exists
only one dependency continent. Each branch node in the
dependency continent has a 1:1 or n:1 relationship with its
direct parent, and thus a 1:1 or n:1 relationship with the root of
the view-relationship graph. Given a DC-node N, no other node
in the XML document obtains its value(s) from the same relation
tuple(s) as N.

Proposition 1 Given a DC-node, all its ancestor nodes are also
DC-nodes.

Definition 3 A referenced peninsula (RP) is a maximal subtree
of the view-definition graph such that both of the following hold:

1) The root R of the subtree has a direct parent in the
dependency continent, and the relationship between R and
its direct parent is 1:n.

2) No node in the subtree is a node in an overlap island.

The root of the subtree is called an RP-root. A node in the XML
document that corresponds to a node in the referenced
peninsula is called an RP-node. If it corresponds to an RP-root,
it is called an RP-root-node. □

Observation 3 Given an RP-root-node, its direct parent has
only one direct child node of its type. For any RP-node N, other
nodes in the XML document may obtain their values from the
same relation tuple(s) as N.

Certain XML document nodes have their data in the view
only once. Such nodes form the dependency continent. The
other nodes may have duplications in the view. If multiple
direct-parent nodes reference the same data via a foreign key,
which implies the parent node can have just one child node of
the given type, then the child node constitutes the root of the
referenced peninsula. Else, if a direct parent can have multiple
child nodes of the same type, we categorize the child nodes as
being in an overlap island. The theorem below follows from this
discussion.

Theorem 1 The dependency continent, referenced peninsulas,
and overlap islands (including those overlap islands
characterized in Section 3.4 as transitive archipelagos) form a
partition of the view-relationship graph.

In the view-relationship graph for the running example, the
dependency continent includes nodes metro, hotel,
conference-room, guest-room, availability, and their child
leaf nodes. There is one referenced peninsula, which has the
node state as its RP-root and the child leaf nodes of state. The
nodes nearby-restaurant and phone-number, together with
their child leaf nodes, form two overlap islands. We discuss the
phone-number element further in Section 3.4.

The algorithm to assign categories to each XML view node
is given as pseudo-code in Algorithm 1. The category
assignment can be done in a single traversal of the view-
relationship graph. Hence, the time complexity is O(n), where n
is the number of nodes in the graph.

Algorithm 1 Node Categorization
procedure node-cat-gen(XMLNode node)
begin
  if (node shares underlying tables with other nodes &&
      the cardinality relationship of node and its parent is not 1:n)
  then
    node is in OI
  else
    switch (direct parent's category)
      case DC:
        switch (cardinality relationship of node and its parent)
          case 1:1: node and its child leaf nodes are in DC
          case n:1: node and its child leaf nodes are in DC
          case 1:n: node and its child leaf nodes are in RP
          case m:n: node and its child leaf nodes are in OI
        end switch
      case RP:
        if (cardinality relationship of node and its parent is m:n)
        then
          node and its child leaf nodes are in OI
        else
          node and its child leaf nodes are in RP
      case OI:
        node and its child leaf nodes are in OI
    end switch
  for (each child branch node sub of node)
    node-cat-gen(sub)
end
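
As a rough illustration, Algorithm 1 can be written as a short recursive Python function. The Node class and its field names are hypothetical, and two choices here are assumptions rather than part of the pseudo-code: the root (which has no parent) is placed in the dependency continent, following Definition 2, and only branch nodes are modeled, since leaf nodes take the category of their branch parent. The cardinality is stored in node:parent order, as in Algorithm 1, so a hotel under a metro is n:1 and a state under a hotel is 1:n.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical branch node of the view-relationship graph."""
    name: str
    rel_to_parent: Optional[str] = None  # node:parent, e.g. "n:1"; None for the root
    shares_tables: bool = False          # shares underlying tables with other nodes
    children: List["Node"] = field(default_factory=list)
    category: Optional[str] = None       # filled in by categorize()


def categorize(node: Node, parent_category: Optional[str] = None) -> None:
    """Assign DC, RP, or OI to node and, recursively, to its branch children."""
    if parent_category is None:
        node.category = "DC"  # assumption: the root starts the dependency continent
    elif node.shares_tables and node.rel_to_parent != "1:n":
        node.category = "OI"
    elif parent_category == "DC":
        node.category = {"1:1": "DC", "n:1": "DC",
                         "1:n": "RP", "m:n": "OI"}[node.rel_to_parent]
    elif parent_category == "RP":
        node.category = "OI" if node.rel_to_parent == "m:n" else "RP"
    else:  # the parent is in an overlap island
        node.category = "OI"
    for child in node.children:
        categorize(child, node.category)


# Part of the running example: state hangs off hotel, hotel off metro.
state = Node("state", "1:n")
hotel = Node("hotel", "n:1", children=[state])
metro = Node("metro", children=[hotel])
categorize(metro)
print(metro.category, hotel.category, state.category)
```

Each node is visited once, which matches the single-traversal argument made for the time complexity.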

3.3 Update Propagation Algorithm

The updatability property and the update execution strategy of
each category are different. We organize the set of possible
updates by node category (DC, RP, OI) and by operation (insert,
delete, move, replace). According to the update semantics
described in Section 2.1, when we update a node, both the node
itself and the entire subtree rooted at the node are affected. Our
execution strategies follow the principles enunciated in [10]:

1) No side-effects.

2) One step changes: only one step of the update execution
affects a given tuple.

3) Minimal changes: no other valid strategy would require a
proper subset of the database update operations.

4) Simplest replacement: no other valid strategy would make
a simpler change such as a proper subset of the attributes.

5) No insert-delete pairs: replacements used instead.



The intuition for the update strategy is as follows: Updating
a node may cause side-effects if and only if the underlying data
to be updated appears in other parts of the view. Observation 2
indicates that a DC-node can be updated without causing side-
effects. According to Observation 3, RP-root-node updates are
allowed because they can be achieved by replacing the related
foreign-key values for their direct parents, which are DC-nodes.
Other nodes cannot be updated because of unavoidable side-
effects. To enforce foreign-key constraints for the underlying
relational database, in certain cases we need to propagate
deletions recursively from a parent node to its child nodes, so as
to eliminate tuples containing a foreign-key value referencing
the deleted tuple(s). The detailed update strategy is given
below:

• Deletion of a branch DC-node:

1) Delete the corresponding tuple in the element base view.

2) Propagate the deletion recursively to all branch DC-
children of the deleted node.

• Insertion of a branch DC-node:

Insertion is allowed only when all of the following hold:

1) The OI-descendants of the inserted node, as given in the
insertion, include exactly those descendant nodes that
can be derived from existing tuples in the database that
satisfy the correlation predicate(s).

2) Each branch node in the inserted subtree has a leaf child
corresponding to the key of the element base view.

Insertion is performed as follows:

1) Insert the corresponding tuple, with the foreign-key
values equal to the key values of its direct parent, into
the element base view.

2) Propagate the insertion recursively to all branch DC-
children of the inserted node.

3) Propagate the insertion to its branch RP-descendants that
contain new values. (Note that this is not a recursive
process because according to the rules discussed below,
non-root RP-nodes cannot be inserted.)

• Movement of a branch DC-node:

Movement is allowed only when the foreign key in the node
to be moved does not itself appear in the view as a leaf node.

Movement is performed by setting the foreign-key values in
the element base view of the DC-node to the key values of its
new direct parent.

• Deletion of a leaf DC-node:

Deletion of a leaf DC-node is allowed only when the node
does not correspond to a foreign key appearing in correlation
predicates.

Deletion is performed by setting the corresponding attribute in
the element base view to NULL.

• Insertion of a leaf DC-node:

Insertion is allowed only when the leaf node does not
correspond to a foreign key appearing in correlation
predicates.

Insertion is performed by assigning a value to the
corresponding attribute in the element base view.

• Deletion of an RP-root-node:

Deletion of an RP-root-node is allowed only when the foreign
key of the parent node does not appear in the view as a leaf
node (within the parent).

Deletion is performed by setting the foreign-key values in the
element base view of its direct parent to NULL.

• Insertion of an RP-root-node:

Insertion is allowed only when all of the following hold:

1) The foreign key of the parent node does not appear in the
view as a leaf node (within the parent).

2) The OI-descendants of the inserted node, as given in the
insertion, include exactly those descendant nodes that
can be derived from existing tuples in the database that
satisfy the correlation predicate(s).

3) Each branch node in the inserted subtree has a leaf child
corresponding to the key of the element base view.

Insertion is performed as follows:

1) Set the foreign-key values in the element base view of its
direct parent to the key values in its element base view.

2) Insert the corresponding tuple into the element base view
if the inserted node contains new values.

3) Propagate the insertion to its branch RP-descendants that
contain new values.

• Updates of a non-root RP-node: not allowed.

• Updates of an OI-node: not allowed.

We do not enumerate the rules for replacement and
movement above, with the exception of branch DC-nodes, as
they can be easily derived from the rules for deletion and
insertion.

Proposition 2 The update propagation algorithm correctly
translates the updates on the XML view into the updates on the
element base views that observe the 5 principles of [10].

Given the above update strategies, it is easy to develop
algorithms to generate the update plan. As an example, we give
the algorithm for translating the deletion of a node from an
XML view document into the element base view(s) updates. In
addition to considering side-effects, we take not-null constraints
into consideration, and explore this further in Section 4.3.

Algorithm 2 Deletion Translation
procedure node-delete(XMLNode node)
begin
  switch (the category of node)
    case DC:
      if (node is a leaf node) then
        if (node is not a required child of its parent) then
          for the element base view of its parent, set the
            corresponding attribute to NULL
        else
          node cannot be deleted according to DTD
      else
        delete the corresponding tuple from element base view
        for (each child branch DC-node sub of node)
          node-delete(sub)
    case RP:
      if (node is an RP-root-node) then
        if (node is not a required child of its parent) then
          for the element base view of its parent, set the
            corresponding foreign key to NULL
        else
          node cannot be deleted according to DTD
      else
        node cannot be deleted to avoid side-effects
    case OI:
      node cannot be deleted to avoid side-effects
  end switch
end
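
A minimal Python sketch of Algorithm 2 follows. It is an illustration only: the Node fields, the NotUpdatable exception, and the tuple-based plan entries ("delete_tuple", "set_null", "set_fk_null") are hypothetical names chosen here, not the update-plan format of the implemented system. The function returns a list of operations against element base views, or raises when the deletion is rejected.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


class NotUpdatable(Exception):
    """Raised when a deletion is rejected (DTD rule or side-effects)."""


@dataclass
class Node:
    """Hypothetical XML view node carrying the facts Algorithm 2 needs."""
    name: str
    category: str             # "DC", "RP", or "OI"
    is_leaf: bool = False
    is_rp_root: bool = False
    required: bool = False    # required child of its parent according to the DTD
    parent: str = ""          # name of the direct parent node
    children: List["Node"] = field(default_factory=list)


def node_delete(node: Node, plan: Optional[List[Tuple]] = None) -> List[Tuple]:
    """Translate the deletion of node into element-base-view operations."""
    plan = [] if plan is None else plan
    if node.category == "DC":
        if node.is_leaf:
            if node.required:
                raise NotUpdatable(f"{node.name}: required by the DTD")
            # Leaf deletion: null out the attribute in the parent's view.
            plan.append(("set_null", node.parent, node.name))
        else:
            plan.append(("delete_tuple", node.name))
            for sub in node.children:
                # Propagate only to branch DC-children.
                if sub.category == "DC" and not sub.is_leaf:
                    node_delete(sub, plan)
    elif node.category == "RP":
        if not node.is_rp_root:
            raise NotUpdatable(f"{node.name}: would cause side-effects")
        if node.required:
            raise NotUpdatable(f"{node.name}: required by the DTD")
        # RP-root deletion: null out the parent's foreign key.
        plan.append(("set_fk_null", node.parent, node.name))
    else:  # OI
        raise NotUpdatable(f"{node.name}: would cause side-effects")
    return plan
```

Collecting the operations in a list, rather than executing them directly, reflects the idea that the plan on element base views is handed to a relational view-update algorithm afterwards.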

3.4 Extended Algorithm for Transitive
Relationships

We call the relationship between a node and its direct parent a
direct relationship, and the relationship between a node and its
ancestors other than its direct parent an indirect relationship. In
Example 3, the node phone-number has an m:n direct
relationship with its parent conference-room. In addition, it
has an indirect relationship with its ancestor hotel. The
correlation predicate of phID = $h.hID indicates that the
cardinality relationship between hotel and phone-number is
1:1. This case was not covered in Sections 3.2 and 3.3.

Definition 4 A transitive relationship is a non-m:n relationship
defined by a correlation predicate between a node and an
ancestor node other than its direct parent. The ancestor is called
its transitive parent. □

Definition 5 A transitive relationship is called an effective
transitive relationship if both of the following hold:

1) The child node N has an m:n relationship with its direct
parent P.

2) Node N has a transitive relationship with some ancestor T.

T is called a real transitive parent of N. N is called a real
transitive child of T, and a pseudo child of P. □

Observation 4 Given a node that has a 1:1 or 1:n relationship
with its real transitive parent, its direct parent has no more than
one child node of its type. Given a node that has an n:1
relationship with its real transitive parent, its direct parent can
have more than one child node of its type.

In Example 3, there exists an effective transitive
relationship between the node phone-number and hotel. The
node phone-number is a real transitive child of the hotel node
and a pseudo child of conference-room. Because the
cardinality relationship between hotel and phone-number is
1:1, each conference-room node can have no more than one
phone-number child.

Definition 6 A transitive relationship is called a double
transitive relationship if all of the following hold:

1) The child node N has a non-m:n relationship with its direct
parent P.

2) N has a transitive relationship with some ancestor T.

3) The direct or indirect relationship between T and P is m:n.

T is called the double transitive parent of N. N is called a
double transitive child of T. □

Observation 5 Given a node that has a 1:1 or 1:n relationship
with either its direct parent or double-transitive parent, its
direct parent has no more than one child node of its type. Given
a node that has n:1 relationships with both its direct parent and
its double-transitive parent, its direct parent can have more than
one child node of its type.

In the view-relationship graph, we add an annotated dotted
edge between a node and its transitive parent.

Definition 7 A transitive archipelago (TA) is a maximal
subtree in the view-relationship graph with a root node that has
a DC-node or an RP-node as its real transitive parent or double
transitive parent. The root of the transitive archipelago is called
a TA-root. A TA-root that has an n:1 relationship with its
effective transitive DC-parent or has n:1 relationships with both
its direct DC-parent and its double transitive DC-parent is a
TA-DC-root; the rest of the TA-roots are TA-RP-roots. A node in
the XML document that corresponds to a node in the transitive
archipelago is a TA-node. If it corresponds to a TA-root, it is a
TA-root-node. □

The nodes in the transitive archipelago are divided into
transitive DC-nodes, transitive RP-nodes and transitive OI-
nodes according to a categorization similar to that of Section
3.2.

Observation 6 A transitive archipelago is a subset of an overlap
island.

Definition 8 A pseudo transitive archipelago (PTA) is a
transitive archipelago in which the transitive parent and the
direct parent of the root node have a direct or indirect
relationship between them that is m:n or 1:n. The root of the
subtree is called a PTA-root. A node in the XML document that
corresponds to the node in the pseudo transitive archipelago is a
PTA-node. If it corresponds to a PTA-root, it is a PTA-root-
node. □

Observation 7 Given a PTA-root-node N, its transitive parent
node P may have more than one descendant node of the same
type as N's direct parent node, and thus may have more than
one transitive child node that obtains its value from the same
relation tuple as N.

Observation 8 Given a non-PTA TA-node N, its transitive
parent node P has only one descendant node of the same type as
N's direct parent node; thus no other nodes in the subtree rooted
at P obtain their values from the same relation tuple as N.

Notice that the update-propagation algorithm described in
Section 3.3 neither allows any update to overlap islands, nor
propagates any update to OI-descendant nodes. With the
introduction of transitive relationships and transitive
archipelagos, the special subareas of overlap islands, we need to
adjust the algorithm in the following aspects:

• Propagation to transitive children: apply the propagations
described in the algorithm to the transitive TA-children and
their descendant nodes, in the same manner as for non-
transitive descendant nodes in corresponding categories.

• Updates of a non-PTA TA-node: apply the same algorithm
as that for a non-TA-node in the corresponding category.

• Updates of a PTA-node: not allowed.

The idea behind the above update strategy is that a TA-
node that is not a PTA-node does not share data with other
nodes in the subtree rooted at its transitive parent, and thus can
be treated the same as a non-TA-node in the corresponding
category. This can be inferred from Observation 8. However,
according to Observation 7, a PTA-node could possibly be
sharing data with other nodes in the subtree rooted at its
transitive parent. Therefore, to avoid side-effects, we do not
allow updates on the node, but we permit propagation from the
transitive parent to all such descendants.

In Example 3, the node phone-number belongs to a
pseudo transitive archipelago, and thus cannot be updated. While
the updates of the node hotel need to propagate to it, the updates
of the node conference-room have no effect on it.

Proposition 3 The update propagation algorithm after the
adjustments described above correctly implements the updates
and observes the 5 principles of [10].

Double transitive relationships and effective transitive
relationships cover the cases where either the transitive parent-
direct parent relationship or the direct parent-child relationship
has m:n cardinality. If both are non-m:n, the problem can be
transformed to the cases discussed in Sections 3.3 and 3.4, by
removing the transitive parent-child relationship or the direct
parent-child relationship without changing the update semantics.
Furthermore, the algorithm can be easily extended to handle the
case where a node has several transitive relationships with
different ancestors.

3.5 Extended Algorithm with IDREF

IDREF attributes are specific to XML documents. A node
referenced by other nodes using IDREF attributes is called a
referenced node. The IDREF attribute nodes referencing it are
called referencing nodes. We add double lines in the view-
relationship graph between the referenced node and its
referencing nodes, with the arrow pointing to the referenced
node.

A referenced node has exactly the same updatability and
update plan as common nodes in the same category, except that
deletions are not allowed on a node being referenced by other
nodes. Usually, referenced nodes are in the dependency continent
and can be safely updated.

The referencing node is an IDREF attribute node, thus a
leaf node. It is treated the same as other leaf nodes in its
category except that during insertion, we need to guarantee the
existence of the referenced nodes.

3.6 Completeness of the Algorithm

Propositions 2 and 3 indicate the soundness of the update
algorithm. Proposition 4 characterizes the scope of the
algorithm.

Proposition 4 Direct relationships, transitive relationships and
IDREF reference relationships cover all explicitly given
relationships in an XML view definition.

4. SYSTEM ARCHITECTURE

To update the XML view, the system needs to parse a given
update command, locate nodes for update in the XML document
or directly locate related tuples in the database, translate the
XML view update into updates against underlying relations, and
finally execute the updates. In addition, consistency checks need
to be enforced, which may include avoiding side-effects, DTD
validation, view-definition predicate checking, and database-
constraint enforcement.

Figure 5: XML view update system architecture

In this section, we present a system architecture that divides
these tasks between the view side and the underlying database
side. This division is necessary, as neither level alone can
suffice. We have implemented our ideas on top of the redacted
system. The same design idea can be applied to other XML-
publishing middleware systems.

4.1 System Description

To attain higher efficiency, we complete part of the work at the
view level, which means the view-level middleware system
takes a first pass at the update task instead of relying on the
underlying database. There are three advantages to this
approach. First, doing consistency testing early and in turn
finding invalid updates early can save unnecessary work in the
underlying relational database. Second, static information can be
collected from the view definition and the underlying relational
database schema at the time of parsing the view definition. This
information, stored at the view level, can be used during the
view-update process to improve performance. Third, when the
work is done at the view side, we have better knowledge about
the remaining update task, and therefore can take advantage of
that knowledge in optimizing the operations on the underlying
database. On the other hand, the database side is more efficient
at accessing data. So any update operation or constraint
checking that needs ancillary data for support is assigned to the
database side. Our system architecture is shown in Figure 5.

At the view side, the parser accepts the view update
command and gets information about the node to be updated, such
as its tag or attribute name, the contents for insertion or
replacement, the XPath and XQuery predicates for location, and
so on. After that, the side-effect checker decides whether the
node can be updated without causing side-effects. Then, the
DTD checker determines whether the node is updatable
according to the DTD, and whether the segment for insertion or
replacement follows the DTD (footnote 3). Next, the system
checks local constraints, including whether the inserted data
violates domain requirements (declared in the underlying
database schema), selection predicates (given in the view
definition), etc. The three checks can be performed
independently, and therefore in parallel. After the above
checking, the update command is translated into update
operations against the underlying database.

The work assigned to the database side consists of three
parts: locating tuples for updates, examining constraints, and
executing updates. The constraints examined at this point, called
global constraints, are across tuples and relations. Such
constraints can be checked only with the knowledge of data in
other tuples. Accessing data and checking those constraints can
be performed more efficiently at the database side. We discuss
local constraints and global constraints further in Section 4.3.

A subtle decision is when and how to locate the data for
updates. One option is to compose the XQuery and XPath
predicates with the view definition and transform them into an
SQL update against the underlying table, leaving the location
task entirely to the database system. An alternative scheme is to
locate the candidate nodes at the view level and update each of
them. Some nodes cannot be located without processing at the
view level, such as those including // or * in the XPath
expression. The possibility and efficiency of pushing down the
processing of location into the relational database is a subject for
future research.

4.2 Information Collection while Parsing
View Definition

The processing of operations at the view side requires
information about the view definition and the relational schema.
This information is static throughout the view-definition life,
and thus can be collected while parsing the view definition. For
each node in the XML view definition, we need the following
information:

• The underlying table for the data in the node. If the node is
derived from an attribute of a relation, we also record the
attribute.

• The node's parent, direct children, and transitive children.

• The category of the node. (Note that these three items
constitute the information built by the view-relationship
graph.)

• The updatability of the node according to the DTD.

• The local constraints of the node.

The first three items can be obtained from the view query
that defines the XML view and the algorithms defined in Section
3. Local constraints are collected from the view query and the
underlying relational database schema as discussed in Section
4.3.

Footnote 3: In case the DTD is not explicitly given, we can
derive it from the XML view definition and the underlying
relational database schema.

4.3 Constraint Satisfaction

An important task in maintaining consistency is ensuring
constraint satisfaction. There are three sources for constraints:
the view definition, the XML DTD and the underlying database
schema. On one hand, certain database constraints are non-
enforceable at the view level either because they involve data
that do not appear in the view or because the application
defining the view lacks the requisite authorization. Meanwhile,
certain constraints arising from the DTD must be enforced at
the view level. On the other hand, some constraints from one
level can be translated into constraints at the other. In those
cases, the choice of level is driven by concerns of efficiency and
effectiveness.

Instead of handling the constraints where they are defined
(at the sources), we categorize them into two classes based on
the number of tuples used to enforce the constraints. If only one
tuple is required to enforce the constraint, we call it a local
constraint and check it at the view level; if not, we call it a
global constraint and handle it in the relational database. In
other words, some database constraints can be checked at the
view level, while some view definition and DTD constraints can
be translated into database constraints.

View-definition constraints come from selections that are
non-correlation predicates. (We do not consider correlation and
join predicates for the element base view because they are
guaranteed implicitly by the update execution plan. We also
ignore aggregates and order-by operations, as we noted earlier.)
Predicates from selections are enforced on a single tuple and
therefore can be enforced at the view. In the DTD, cardinality-
related constraints should be transformed and checked at the
relational database, as we need data from other tuples to enforce
them. Other DTD constraints can be easily enforced at the view
level.

There are five types of constraints for relational databases:

1. Key constraints: Key constraints are global constraints, as
a base table scan or an index lookup needs to be performed
to rule out the possibility of duplicate keys.

2. Foreign-key constraints: Constraints included in
correlation or join predicates of the view can be enforced
by update execution plans. However, if there exists a key-
foreign-key relationship between a relation present in the
XML view definition and a relation not involved in the
view definition, we categorize the constraint as a global
constraint that will be handled by the relational database.

3. Domain constraints: The effect of domain constraints is
limited to a single tuple, and maybe a single attribute.
They can be collected while parsing the view definition and
enforced at the view.



Figure 6: Summary of constraint categories (View/DTD
constraints and database constraints, grouped into local
constraints and those left to the database)

4. Not-null constraints: A not-null attribute in a relation that
is used in the XML view definition should correspond to an
obligatory leaf node. Such constraints are categorized as
local constraints. They can be transformed into DTD
constraints and enforced at the view during updates.

5. Constraints defined using triggers: These constraints are
considered to be global constraints and enforced at the
relational database.

In summary, local constraints, including view selection
predicates, domain constraints, not-null constraints, and non-
cardinality DTD constraints, should be collected at the time of
parsing the view definition and enforced at the view level.
Others are left to the database management system. This is
illustrated in Figure 6.
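
As a minimal sketch of this summary, the categories can be written
as a lookup table. The category labels, the Level enum, and the
partition function are hypothetical names chosen for illustration;
they are not taken from the implementation.

```python
from enum import Enum


class Level(Enum):
    VIEW = "local: collected at parse time, enforced at the view level"
    DATABASE = "global: left to the database management system"


# Category labels are illustrative. Foreign-key constraints that appear
# in correlation or join predicates of the view are omitted here because
# they are enforced by the update execution plans themselves.
CATEGORY_LEVEL = {
    "view_selection_predicate": Level.VIEW,
    "domain": Level.VIEW,
    "not_null": Level.VIEW,
    "dtd_non_cardinality": Level.VIEW,
    "dtd_cardinality": Level.DATABASE,
    "key": Level.DATABASE,
    "foreign_key_outside_view": Level.DATABASE,
    "trigger": Level.DATABASE,
}


def partition(constraints):
    """Split (name, category) pairs into (local, global) name lists."""
    local, global_ = [], []
    for name, category in constraints:
        if CATEGORY_LEVEL[category] is Level.VIEW:
            local.append(name)
        else:
            global_.append(name)
    return local, global_
```

A table-driven form keeps the classification in one place, so that
a change in category assignment does not touch the checking code.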

4.4 Implementation 

We implemented the update architecture based on the redacted
system. The architecture consists of two modules, the
information-collection module and the view-update execution
module.

The information-collection module collects the static
information described in Section 4.2 at the time when the view
definition is parsed and sets up the view-relationship graph. The
view-relationship graph is then translated into update plans that
are persisted in the system and later used at run time.

The view-update execution module provides the interface
for deletion, insertion, movement and replacement on a given
XML DOM node at run time. The execution module interacts
with the relational database and the DOM interface to access the
underlying data for the XML view.

The two modules are connected through the persisted
update plans that provide the necessary update translation and
propagation information. Experimental results show that the
system operates correctly, and the performance is commensurate
with direct execution without use of a view. A full performance
evaluation is the subject of planned future work.
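
The way persisted update plans connect the two modules can be
sketched as below. This is a toy illustration under strong
assumptions, not our implementation: the plan format (one table
name and one key column per view node), the JSON encoding, and
the function names are all hypothetical, and real update plans
carry far more translation and propagation information.

```python
import json


def collect_information(view_definition):
    """Parse-time step: derive one plan per view node and persist it.

    view_definition is a hypothetical dict mapping a view node name
    to the base table and key column that back it.
    """
    plans = {}
    for node, info in view_definition.items():
        plans[node] = {"table": info["table"], "key": info["key"]}
    return json.dumps(plans)  # stands in for the persisted form


def execute_delete(persisted_plans, node, key_value):
    """Run-time step: translate a node deletion using the stored plan.

    Returns a parameterized SQL string and its parameter tuple.
    """
    plan = json.loads(persisted_plans)[node]
    sql = f"DELETE FROM {plan['table']} WHERE {plan['key']} = ?"
    return sql, (key_value,)
```

Because the run-time step reads only the persisted plan, the view
definition does not need to be parsed again for each update.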

Our prototype implementation has several limitations at
this time. We have not implemented an XQuery-based update
language, and do not use XPath-like syntax to perform node
location. We believe they are not pivotal to the algorithmic ideas
of this paper. Also, we have not implemented updates for views
with IDREF attributes, as view queries in the redacted system
currently do not support IDREF(s).

5. CONCLUSION AND FUTURE WORK

In this paper, we discussed XML view updates in the context of
an underlying relational database that serves not only the XML-
based application, but also traditional RDBMS applications.
Given a pre-existing underlying relational-database schema and
an XML view defined on it, we present a framework for
generating update plans to perform an update without
introducing side-effects to other parts of the view.

We presented an update architecture that distributes the
update subtasks between the view and underlying database,
relying on the layer where efficiency is higher. Underlying
relational database constraints, view constraints and DTD
constraints are enforced to ensure consistency.

Although we based our discussion and implemented the
update algorithms on the redacted system and a specific XML
view-definition language, the discussions and algorithms can be
easily applied to other systems and languages. Future extensions
to our work could include considering element orders in XML
view updates, studying the efficiency issue of locating the
updated nodes at the view level and locating the related tuples at
the database level, and addressing the concurrency-control issue.
When completely implemented, we expect this update
architecture to be the first full-featured XML view-update
system for XML middleware.